Abstract

Gibbs sampling, also known as Glauber dynamics, is a popular technique for sampling high-dimensional
distributions defined on graphs. Of special interest is the behavior of Gibbs sampling on the
Erdos-Renyi random graph G(n, d/n), where each edge is chosen independently with probability d/n
and d is fixed. While the average degree in G(n, d/n) is d(1 - o(1)), it contains many nodes of degree
of order (log n)/(log log n).

The existence of nodes of almost logarithmic degrees implies that for many natural distributions
defined on G(n, d/n) such as uniform coloring (with a constant number of colors) or the Ising model
at any fixed inverse temperature $\beta$, the mixing time of Gibbs sampling is at least $n^{1+\Omega(1/\log\log n)}$ with
high probability. High degree nodes pose a technical challenge in proving polynomial time mixing of
the dynamics for many models including coloring. Almost all known sufficient conditions in terms of
number of colors needed for rapid mixing of Gibbs samplers are stated in terms of the maximum degree
of the underlying graph.

In this work we consider sampling q-colorings and show that for every $d < \infty$ there exists $q(d) < \infty$
such that for all $q > q(d)$ the mixing time of Gibbs sampling on G(n, d/n) is polynomial in n with
high probability. Our results are the first polynomial time mixing results proven for the coloring model
on G(n, d/n) for d > 1 where the number of colors does not depend on n. They also provide a rare
example where one can prove polynomial time mixing of a Gibbs sampler in a situation where the actual
mixing time is slower than n polylog(n). In previous work we have shown that similar results hold for
the ferromagnetic Ising model. However, the proof for the Ising model crucially relied on monotonicity
arguments and the "Weitz tree", both of which have no counterparts in the coloring setting. Our proof
presented here exploits in novel ways the local treelike structure of Erdos-Renyi random graphs, block
dynamics, spatial decay properties and coupling arguments.

Our results give the first FPRAS to sample colorings on G(n, d/n) with a constant number of colors.
They extend to much more general families of graphs which are sparse in some average sense and to
much more general interactions. In particular, they apply to any graph for which there exists an $\alpha > 0$
such that every vertex v of the graph has a neighborhood N(v) of radius O(log n) in which the induced
sub-graph is the union of a tree and at most O(1) edges and where each simple path $\Gamma$ of length O(log n)
satisfies $\sum_{u \in \Gamma} \sum_{v \neq u} \alpha^{d(u,v)} = O(\log n)$. The results also generalize to the hard-core model at low
fugacity and to general models of soft constraints at high temperatures.
1 Introduction



Efficient approximate sampling from Gibbs distributions is a central challenge of randomized algorithms.
Examples include sampling from the uniform distribution over independent sets of a graph [23, 22, 7, 8],
sampling from the uniform distribution of matchings in a graph [15], or sampling from the uniform
distribution of colorings [12, 6, 5] of a graph. A natural family of approximate sampling techniques is given by
Gibbs samplers, also known as Glauber dynamics. These are reversible Markov chains that have the desired
distribution as their stationary distribution and where at each step the status of one vertex is updated. It is
typically easy to establish that the chains will eventually converge to the desired distribution.

Studying the convergence rate of the dynamics is interesting from both the theoretical computer science and
the statistical physics perspectives. Approximate convergence in polynomial time, sometimes called rapid
mixing, is essential in computer science applications. The convergence rate is also of natural interest in
physics, where the dynamical properties of such distributions are extensively studied, see e.g. [17]. Much
recent work has been devoted to determining sufficient and necessary conditions for rapid convergence of
Gibbs samplers. A common feature of most of this work [23, 22, 7, 8, 12, 6, 16, 18] is that the conditions for
convergence are stated in terms of the maximal degree of the underlying graph. In particular, these results
do not allow for the analysis of the mixing rate of Gibbs samplers on the Erdos-Renyi random graph, which
is sparse on average, but has a small number of denser sub-graphs. In a recent work [19] we have shown that
for any d, if $0 < \beta < \beta(d)$ is sufficiently small then Gibbs sampling for the Ising model on G(n, d/n)
mixes rapidly. We show that the same result is true in the presence of an arbitrary external field. The proofs
of [19] crucially rely on the monotonicity of the Ising model and on the "Weitz tree" [23], which is only
defined for two-spin models. Thus the proof does not apply to models such as the hard-core model or to
sampling uniform colorings. Other recent work has been invested in showing how to relax statements so that
they do not involve maximal degrees [5, 13], but the results are not strong enough to imply rapid mixing of
Gibbs sampling for uniform colorings on G(n, d/n) for d > 1 and O(1) colors. This is presented as a major
open problem of both Q and [19].

In this paper we give the first rapid convergence result of Gibbs samplers for the coloring model on
Erdos-Renyi random graphs in terms of the average degree and the number of colors only. Our results yield the
first FPRAS for sampling the coloring distribution in this case. Our results are further extended to more
general families of graphs that are "tree-like" and "sparse on average". These are graphs where every vertex
has a radius O(log n) neighborhood which is a tree with at most O(1) edges added and where for each
simple path $\Gamma$ of length O(log n) it holds that $\sum_{u \in \Gamma} \sum_{v \neq u} \alpha^{d(u,v)} \le O(\log n)$, where $\alpha > 0$ is some fixed
parameter.

Subsequent to completing this work we learned that Spirakis and Efthymiou [9] have independently
produced a scheme for approximately sampling from the random coloring distribution in polynomial time.
They take a different approach: instead of sampling using MCMC, they assign colours to vertices one at
a time by calculating the conditional marginal distributions, making use of the decay in correlation on the
graph.

Our arguments extend to prove similar results for many other models. In particular, they give an independent
proof of rapid mixing for sampling from the Ising model for small inverse temperature $\beta$, the hard-core
model for small fugacity $\lambda$ and many other models. Note however, that the results presented here for the Ising
model on general graphs are slightly weaker than the result of [19]. Here we require that each O(log n)
radius neighborhood is a tree union a constant number of edges while in [19] an excess of O(log n) is
allowed.

Below we define the coloring model and Gibbs samplers and state our main result for coloring. Some related
work and a sketch of the proof are also given in the introduction. Section 2 gives a more detailed proof.

1.1 Models 

Our results cover a wide range of graph based distributions including the coloring model, the hardcore model 
and any model with soft constraints. 

Definition 1 Let G = (V, E) be a graph and let C be a set of states/colours with |C| = q. The Hamiltonian
is a function $H : C^V \to \mathbb{R} \cup \{-\infty\}$ of the form

$H(\sigma) = \sum_{u \in V} h(\sigma(u)) + \sum_{(u,v) \in E} g(\sigma(u), \sigma(v))$ (1)

where $h : C \to \mathbb{R}$ is the activity function and $g : C^2 \to \mathbb{R} \cup \{-\infty\}$ is a symmetric interaction function. This
defines an interacting particle system which is the distribution on $\sigma \in C^V$ given by

$P(\sigma) = \frac{1}{Z} \exp(H(\sigma))$

where Z is a normalizing constant. We focus our attention on 3 classes of models.

• The coloring distribution is the uniform distribution over colorings of G. With $h \equiv 0$ and $g(x,y) =
-\infty \cdot 1_{\{x=y\}}$ the distribution is given by

$P(\sigma) = \frac{1}{Z} \prod_{(u,v) \in E} 1_{\{\sigma(u) \neq \sigma(v)\}}.$

• The hardcore model with parameter $\beta$ is the weighted distribution over independent sets of G given
by C = {0, 1} with $h(x) = \beta x$ and $g(x, y) = -\infty \cdot 1_{\{x=y=1\}}$ and

$P(\sigma) = \frac{1}{Z} \exp\Big(\beta \sum_{u \in V} \sigma(u)\Big) \prod_{(u,v) \in E} 1_{\{\sigma(u)\sigma(v)=0\}}$ (3)

where $\sigma$ takes values in $\{0, 1\}^V$ and Z is a normalizing constant.

• If g does not take the value $-\infty$ then we say the model has soft constraints. This class includes the
Ising model.

For $U \subset V$ we let $P_U$ be the colouring model on the subgraph induced by U. Define the activity free system
$\widetilde{P}$ as the distribution with the activity function h set to 0. The norm of the Hamiltonian is defined as

$\|H\| := \max\Big\{ \max_{x \in C} |h(x)|, \max_{x,y \in C} |g(x,y)| \Big\}.$

1.2 Gibbs Sampling 

The Gibbs sampler is a Markov chain on configurations where a configuration $\sigma$ is updated by choosing a
vertex v uniformly at random and assigning it a spin according to the Gibbs distribution conditional on the
spins on G - {v}.






Definition 2 Given a graph G = (V, E), a set C and a Hamiltonian H as in (1), the Gibbs sampler is
the discrete time Markov chain on $C^V$ where given the current configuration $\sigma$ the next configuration $\sigma'$ is
obtained by choosing a vertex v in V uniformly at random and

• Letting $\sigma'(w) = \sigma(w)$ for all $w \neq v$.

• $\sigma'(v)$ is assigned the element $x \in C$ with probability

$\frac{1}{Z'} \exp\Big( h(x) + \sum_{w \in N(v)} g(\sigma(w), x) \Big),$

where $N(v) = \{w : (v,w) \in E\}$ and Z' is a normalization constant.

Note that in the case of coloring $\sigma'(v)$ is chosen uniformly from the set $C \setminus \{\sigma(w) : w \in N(v)\}$.
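
As an illustration of Definition 2 in the coloring case, the following is a minimal sketch of one update of the single-site dynamics. The names `gibbs_step` and `is_proper`, the adjacency-list representation and the 4-cycle example are assumptions made for this sketch and are not taken from the paper. The sketch also assumes q exceeds every vertex degree, so that the set of allowed colors is never empty.

```python
import random

def gibbs_step(adj, coloring, q, rng):
    """One step of single-site Glauber dynamics for proper q-colorings."""
    v = rng.randrange(len(adj))              # vertex chosen uniformly at random
    blocked = {coloring[w] for w in adj[v]}  # colors currently used by neighbors of v
    allowed = [x for x in range(q) if x not in blocked]
    coloring[v] = rng.choice(allowed)        # uniform over C \ {sigma(w) : w in N(v)}

def is_proper(adj, coloring):
    """True if no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[w] for u in range(len(adj)) for w in adj[u])

# Hypothetical example: a 4-cycle with q = 4 colors, started from a proper coloring.
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
coloring = [0, 1, 0, 1]
rng = random.Random(0)
for _ in range(1000):
    gibbs_step(adj, coloring, 4, rng)
```

Each update keeps the coloring proper, so the chain never leaves the set of legal colorings once it starts in it.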

In the coloring model, it is not completely trivial to find an initial configuration that is a legal coloring.
However, for G(n, d/n) finding an initial coloring is easy [21]. It is well known that with high probability
if one removes all nodes of large enough degree D'(d) from G(n, d/n) then what remains is a collection
of unicyclic components. It is easy to color each unicyclic component with 3 colors and therefore color the
graph with D' + 3 colors. Similar arguments will allow us to find an initial coloring in the more general
setting discussed here. See [10] for a survey of algorithmic results for finding legal colorings in sparse
random graphs. For the hard-core model and models with soft constraints, it is trivial to find an initial legal
configuration.

We will be interested in the time it takes the dynamics to get close to the distribution P. The mixing time
$\tau_{mix}$ of the chain is defined as the number of steps needed in order to guarantee that the chain, starting from
an arbitrary state, is within total variation distance $(2e)^{-1}$ from the stationary distribution.

1.3 Erdos-Renyi Random Graphs and Other Models of Graphs

The Erdos-Renyi random graph G(n, p) is the graph with n vertices V and random edges E where each
potential edge $(u,v) \in V \times V$ is chosen independently with probability p. We take p = d/n where d > 1
is fixed. In the case d < 1, it is well known that with high probability all components of G(n, p) are
unicyclic and of logarithmic size, which implies immediately that the dynamics considered here mix in time
polynomial in n.

For a vertex v in G(n, d/n) let $V(v, l) = \{u \in G : d(u, v) \le l\}$, the set of vertices within distance l of v,
let $S(v, l) = \{u \in G : d(u, v) = l\}$, let $E(v, l) = \{(u, w) \in G : u, w \in V(v, l)\}$ and let B(v, l) be the
graph (V(v, l), E(v, l)).
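
To make the random graph model concrete, here is a minimal sampling sketch. It assumes potential edges are unordered pairs of distinct vertices (no self-loops). The function name `erdos_renyi` and the parameters n = 500, d = 3 are illustrative choices, not values from the paper.

```python
import random

def erdos_renyi(n, d, rng):
    """Sample G(n, d/n): each unordered pair {u, v} is an edge independently with probability d/n."""
    p = d / n
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

rng = random.Random(1)
adj = erdos_renyi(500, 3.0, rng)
degrees = [len(nb) for nb in adj]
avg_degree = sum(degrees) / len(degrees)  # concentrates near d(1 - 1/n)
max_degree = max(degrees)                 # typically noticeably larger than the average
```

Comparing `avg_degree` with `max_degree` on a few samples shows the gap between average and maximal degree that motivates stating results in terms of d rather than the maximum degree.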

Our results only require some simple features of the neighborhoods of all vertices in the graph stated in 
terms of t and m below. 

Definition 3 Let G = (V, E) be a graph and v a vertex in G. Let t(G) denote the tree excess of G, i.e.,

t(G) = |E| - |V| + 1.

For $v \in V$ we let t(v, l) = t(B(v, l)).

We call a path $v_1, v_2, \ldots$ self-avoiding if for all $i \neq j$ it holds that $v_i \neq v_j$.

For $\alpha > 0$ we let the maximal path $\alpha$-weight of a subgraph $H \subset G$ be defined by

$m_\alpha(H, l) = \max_{\Gamma} \sum_{u \in \Gamma} \sum_{v : u \neq v \in G} \alpha^{d(u,v)}$

where the maximum is taken over all self-avoiding paths $\Gamma \subset H$ of length at most l.
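
The tree excess t(v, l) of a ball can be computed directly from the definition with a breadth-first search. The sketch below assumes a simple undirected graph stored as symmetric adjacency lists. The helper names `ball` and `tree_excess` and the small example graph are hypothetical.

```python
from collections import deque

def ball(adj, v, l):
    """Distances from v to every vertex within graph distance l, via breadth-first search."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == l:
            continue  # do not explore beyond radius l
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def tree_excess(adj, v, l):
    """t(v, l) = |E(v, l)| - |V(v, l)| + 1 for the induced ball B(v, l)."""
    verts = ball(adj, v, l)
    # Every induced edge is seen once from each endpoint, hence the division by 2.
    edges = sum(1 for u in verts for w in adj[u] if w in verts) // 2
    return edges - len(verts) + 1

# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3 attached to 0.
adj = [[1, 2, 3], [0, 2], [0, 1], [0]]
```

In this example the radius-1 ball around vertex 3 is a single edge (excess 0), while the radius-2 ball contains the triangle (excess 1).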
1.4 Our Results



1.4.1 Colouring Model 

Theorem 1 For all d > 1 there exists $q(d) < \infty$ such that for all $q > q(d)$ the following holds. Let G be
a random graph distributed as G(n, d/n). Then with high probability the mixing time of Gibbs sampling of
q-colorings is $O(n^C)$ for some constant C.

The theorem above may be viewed as a special case of the more general result.

Theorem 2 For any $0 < a, \alpha, t, \delta < \infty$ there exist constants $q(a, \alpha, t, \delta)$ and $C = C(a, \alpha, t, \delta)$ such that
if $q > q(a, \alpha, t, \delta)$ and G = (V, E) is any graph on n vertices satisfying

$\forall v \in V, \; t(v, a \log n) \le t, \quad m_\alpha(G, a \log n) \le \delta \log n,$ (4)

then the mixing time of the Gibbs sampler of q-colorings of G is $O(n^C)$.

1.4.2 Hardcore Model

Theorem 3 For all d > 1 there exists $\beta(d) < \infty$ such that for all $\beta < \beta(d)$ the following holds. Let G be
a random graph distributed as G(n, d/n). Then with high probability the mixing time of Gibbs sampling of
the hardcore model with parameter $\beta$ is $O(n^C)$ for some constant C.

The theorem above may be viewed as a special case of the more general result.

Theorem 4 For any $0 < a, \alpha, t, \delta < \infty$ there exist constants $\beta(a, \alpha, t, \delta)$ and $C = C(a, \alpha, t, \delta)$ such that
if $\beta < \beta(a, \alpha, t, \delta)$ and G = (V, E) is any graph on n vertices satisfying

$\forall v \in V, \; t(v, a \log n) \le t, \quad m_\alpha(G, a \log n) \le \delta \log n,$ (5)

then the mixing time of the Gibbs sampler of the hardcore model with parameter $\beta$ is $O(n^C)$.

1.4.3 Soft Constraints

Theorem 5 For all d > 1 there exists $0 < H^*(d) < \infty$ such that for all models with $\|H\| < H^*(d)$ the
following holds. Let G be a random graph distributed as G(n, d/n). Then with high probability the mixing
time of Gibbs sampling of the model is $O(n^C)$ for some constant C.

The theorem above may be viewed as a special case of the more general result.

Theorem 6 For any $0 < a, \alpha, t, \delta < \infty$ and all soft constraint models there exist constants $H^*(a, \alpha, t, \delta) > 0$
and $C = C(a, \alpha, t, \delta)$ such that if $\|H\| < H^*(a, \alpha, t, \delta)$ and G = (V, E) is any graph on n vertices
satisfying

$\forall v \in V, \; t(v, a \log n) \le t, \quad m_\alpha(G, a \log n) \le \delta \log n,$ (6)

then the mixing time of the Gibbs sampler of the model is $O(n^C)$.
1.5 Related Work



Most results for mixing rates of Gibbs samplers are stated in terms of the maximal degree. Thus for sampling
uniform colorings, the results are of the form: for every graph where all degrees are at most d, if the number of
colors q satisfies $q > q(d)$ then Gibbs sampling is rapidly mixing [23, 22, 7, 8, 12, 6, 16, 18]. For example,
it is well known and easy to see that one can take q(d) = 2d. Similarly, results for the Ising model are stated
in terms of $\beta < \beta(d)$. The novelty of the result of [19] and the result presented here is that they allow us to
study graphs where the average degree is small while some degrees may be large.

Previous attempts at studying this problem for sampling uniform colorings yielded weaker results. In Q
it is shown that Gibbs sampling rapidly mixes on G(n, d/n) if $q = \Omega_d((\log n)^{\alpha})$ where $\alpha < 1$ and that a
variant of the algorithm rapidly mixes if $q \ge \Omega_d(\log\log n/\log\log\log n)$. Indeed the main open problem
of llll is to determine if one can take q to be a function of d only.

Comparing the results presented here to [19] we observe first that there is one sense in which the current
results are weaker. In [19] the tree excess t can be of order O(log n) while for the results presented here t
has to be of order O(1). The results of [19] crucially use the fact that the Ising model is attractive (this is a
monotonicity property) and that it is a two-spin system, which allows using the "Weitz tree" [23].
We note that for all q and all d the mixing time of Gibbs sampling on G(n, d/n) is with high probability
at least $n^{1+\Omega(1/\log\log n)} \gg n\,\mathrm{polylog}(n)$, see [5, 19] for details. It is an important challenge to find the
critical q = q(d) for rapid mixing. In particular, the question is if the threshold can be formulated in terms
of the coloring model on a branching process tree with Poisson(d) degree distribution. One would expect
rapid mixing in the "uniqueness phase", but perhaps even beyond it, see [20], [19], [Tn .

1.6 Proof Technique 

We briefly sketch the main ideas behind the proof focusing on the special case of coloring. 

Block Dynamics and Path Coupling. The basic idea of the proof is quite standard. It is based on a
combination of block dynamics, see e.g. [17], and path coupling, see e.g. jO, techniques. We wish to divide
the vertex set V of the graph G into disjoint blocks $V_1, \ldots, V_k$ with the following properties:

• There is at most one edge between any pair of blocks.

• For each block $V_i$ and any boundary conditions outside the block, the relaxation time of the dynamics
restricted to $V_i$ is polynomial in n.

• If we consider the block dynamics, where we pick a vertex $v \in V$ uniformly at random and update
the block $V_i$ containing it according to the conditional probability on $V \setminus V_i$, then it has the following
property: Given two configurations $\sigma$ and $\tau$ that differ at one vertex v, the updated configurations
$\sigma'$ and $\tau'$ may be coupled in such a way that the expected number of differences between them is
$1 - \Theta(1/n)$.

The properties above imply a polynomial mixing time for the single site Gibbs-sampling dynamics. 
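
The way the third property yields fast mixing of the block dynamics can be sketched with the standard path coupling computation. The constant c > 0 below is a hypothetical name for the implied constant in the contraction bound, and the sketch assumes that Hamming distance is a path metric on the set of configurations the coupling is applied to.

```latex
% Assumption: for configurations differing at one vertex,
%   E[d_H(\sigma', \tau')] \le 1 - c/n   for some constant c > 0 (hypothetical name).
% Path coupling extends this to arbitrary pairs:
\mathbb{E}\big[d_H(\sigma', \tau')\big] \le \Big(1 - \frac{c}{n}\Big)\, d_H(\sigma, \tau).
% Iterating t steps from any pair, and using d_H \le n:
\mathbb{E}\big[d_H(\sigma_t, \tau_t)\big] \le n \Big(1 - \frac{c}{n}\Big)^{t} \le n\, e^{-ct/n}.
% By Markov's inequality the coupled chains disagree with probability at most
\Pr[\sigma_t \neq \tau_t] \le n\, e^{-ct/n} \le \frac{1}{2e}
\quad \text{once } t \ge \frac{n}{c} \log(2en).
```

Under these assumptions the block dynamics mixes in O(n log n) block updates, and the first two properties are what transfer this to the single site dynamics at a polynomial cost.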

Block Decomposition: First Attempt. The main task is therefore to show that such a decomposition into
blocks exists when (4) holds and q is large enough. A key concept in the construction of the blocks is the
notion of good vertices. Roughly speaking the blocks are constructed in such a way that the boundary of
each block consists of good vertices only.
Good vertices v are vertices that are of degree bounded by c and such that

$\sum_{u \neq v} \alpha^{d(u,v)} \le \epsilon.$ (7)

A nice feature of this definition is that it is easy to see that if all the vertices at a boundary of a block V
satisfy (7) then any vertex inside the block satisfies the same inequality with a modified constant instead of $\alpha$.
Assume for a moment that all blocks constructed are trees. In this case (7) implies that for a large enough q
and given two boundary conditions that differ at one site, it is possible to couple the configurations inside the
block with expected Hamming distance $\epsilon$. Moreover, in the case where all the blocks are trees, we show that
the second condition in (4) together with the small effect of the boundary implies a polynomial relaxation
time of the dynamics inside the block.

Cyclic components and skeletons. More work is needed since we may not assume that all blocks are
trees. In fact, a crucial step of the construction is to show that there are components $W_1, \ldots, W_r$ that
contain all cycles of length O(log n) and such that all degrees in $W_i$ are bounded, the size of each $W_i$ is
O(log n) and the distance between $W_i$ and $W_j$ is $\Omega(\log n)$. All of the properties above follow from the
assumption on the tree excess. We call the components $W_i$ the skeletons.

Given the skeletons $W_i$, we consider two types of blocks: tree blocks and the blocks consisting of $W_i$ and
trees attaching to $W_i$. Using (4) we show that the mixing time of each block is polynomial in n and that the
effect of the boundary on each block is small. This allows us to deduce a polynomial mixing time bound.

2 Proofs 

2.1 Proof of Theorems 1, 3 and 5

Proof: (Theorems 1, 3 and 5) The proofs follow from Lemma 1 below and Theorems 2, 4 and 6 respectively. ■

Lemma 1 For every d > 1 there exist $0 < a, \alpha, t, \delta < \infty$ such that if G is a random graph distributed according
to G(n, d/n) then with high probability $m_\alpha(G, a \log n) \le \delta \log n$ and for all $v \in V$, $t(v, a \log n) \le t$.

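Proof: It is well known that G(n, d/n) satisfies $t(v, 2a \log n) \le 1$ for all v with high probability, provided
that $a = a(d) > 0$ is sufficiently small, see, e.g., [19]. Next we show that if $\alpha$ is sufficiently small then with
high probability for all $v_0$ and all $\Gamma$, a self-avoiding path of length $a \log n$ starting at the vertex $v_0$, it holds
that

$\sum_{u \in \Gamma} \sum_{v \neq u} \alpha^{d(u,v)} \le \delta \log n.$

Considering the contribution to the sum from $v \notin B(v_0, 2a \log n)$ we see that

$\sum_{u \in \Gamma} \sum_{v \neq u} \alpha^{d(u,v)} \le \sum_{u \in \Gamma} \sum_{v : u \neq v \in B(v_0, 2a \log n)} \alpha^{d(u,v)} + (a \log n) \times n \times \alpha^{a \log n}.$

Note that $(a \log n) \times n \times \alpha^{a \log n} = o(1)$ if $\alpha > 0$ is small enough so that $a \log \alpha + 1 < 0$. In order to bound
the first sum we note that

$\sum_{u \in \Gamma} \sum_{v : u \neq v \in B(v_0, 2a \log n)} \alpha^{d(u,v)} \le \sum_{D=1}^{2a \log n} \alpha^{D} \sum_{v \in B(v_0, 2a \log n)} |\{u \in \Gamma : d(v,u) = D\}|.$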
Note that for each $v \in B(v_0, 2a \log n)$ the size of the set $\{u \in \Gamma : d(v,u) = D\}$ is at most 4. Indeed
suppose that there are five elements $u_1, \ldots, u_5$ in this set. For $u_i$ denote by $u'_i$ the last point on $\Gamma$ on a
shortest path from $u_i$ to v and let $w_i$ be the following point. Since $\Gamma$ is a path it follows that the size of the
set $\{u'_i : 1 \le i \le 5\}$ is at least 3. Without loss of generality assume that $u'_1$, $u'_2$ and $u'_3$ are distinct.
Then removing the edges $(u'_1, w_1)$ and $(u'_2, w_2)$ will maintain the connectivity properties of $B(v_0, 2a \log n)$,
contradicting the fact that $t(v_0, 2a \log n) \le 1$. The argument above implies that

$\sum_{D=1}^{2a \log n} \alpha^{D} \sum_{v \in B(v_0, 2a \log n)} |\{u \in \Gamma : d(v,u) = D\}| \le 4 \sum_{D=1}^{2a \log n} \alpha^{D} |\{v \in B(v_0, 2a \log n) : d(v, \Gamma) \le D\}|.$

We now use the well known expansion bounds implying that in G(n, d/n) with high probability all
connected sets $\Gamma$ of size at least $a \log n$ have at most $h^{D}|\Gamma|$ elements at distance at most D from $\Gamma$, which allows
us to bound the last sum as

$4 a \log n \sum_{D=1}^{2a \log n} \alpha^{D} h^{D} \le \frac{\delta}{2} \log n,$

provided $\alpha$ is small enough. Finally, we recall the proof of the expansion bound. Note that it suffices to show
that for all connected sets $\Gamma$ of size at least $a \log n$, the number of elements at distance exactly 1 from the
set is bounded by $(h - 1)|\Gamma|$. By a first moment calculation, the probability that a set with more neighbors
exists is bounded by:

$\sum_{s = a \log n}^{n} \binom{n}{s} s^{s-2} \Big(\frac{d}{n}\Big)^{s-1} P[\mathrm{Bin}(s(n-s), d/n) \ge (h-1)s] \le \sum_{s = a \log n}^{n} n\, d^{s} e^{s}\, P[\mathrm{Bin}(sn, d/n) \ge (h-1)s] = o(1),$

provided h is large enough since by standard large deviation results,

$P[\mathrm{Bin}(sn, d/n) \ge (h-1)s] \le \mathbb{E} \exp(\mathrm{Bin}(sn, d/n) - (h-1)s)$

$= \Big(1 + \frac{d(e-1)}{n}\Big)^{sn} \exp(-(h-1)s)$

$\le \exp(s[d(e-1) - (h-1)]).$

■
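
The exponential-moment bound used in the last display can be checked numerically on a small instance. The sketch below compares the exact binomial tail with the bound exp(s[d(e - 1) - (h - 1)]). The helper names and the parameter values n = 200, d = 2, s = 5, h = 10 are illustrative assumptions only.

```python
import math

def binom_tail(N, p, k):
    """Exact P[Bin(N, p) >= k], with each term computed in log space to avoid overflow."""
    total = 0.0
    for j in range(k, N + 1):
        log_term = (math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
                    + j * math.log(p) + (N - j) * math.log1p(-p))
        total += math.exp(log_term)
    return total

def chernoff_bound(s, d, h):
    """The upper bound exp(s [d(e - 1) - (h - 1)]) on P[Bin(sn, d/n) >= (h - 1)s]."""
    return math.exp(s * (d * (math.e - 1) - (h - 1)))

# Illustrative parameters, chosen only to keep the computation small.
n, d, s, h = 200, 2.0, 5, 10
exact = binom_tail(s * n, d / n, (h - 1) * s)
bound = chernoff_bound(s, d, h)
```

The bound decays exponentially in s as soon as h - 1 > d(e - 1), which is what makes the sum over s in the first moment calculation small for large h.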

2.2 Notation 

Definition 4 Let $\partial U$ denote the interior boundary of U:

$\partial U = \{u \in U : \exists u' \in U^c \text{ s.t. } (u', u) \in E\}.$
Let $\partial^+ U$ denote the exterior boundary of U:

$\partial^+ U = \{u \in U^c : \exists u' \in U \text{ s.t. } (u', u) \in E\}.$
For $U \subset W \subset V$ denote the exterior boundary of W with respect to U:

$\partial^+_W U = \{u \in W^c : \exists u' \in U \text{ s.t. } (u', u) \in E\}.$

If T is a tree rooted at $\rho$ and $u \in T$ then we let $T_u$ denote the subtree of u and all its descendants. Let
$T_u^+$ denote $T_u \cup \partial^+ T_u$.
Definition 5 Define the $\alpha$-weight of a vertex v by $\varphi_\alpha(v) = \sum_{u \neq v} \alpha^{d(u,v)}$. We call v a $(c, \alpha, \epsilon)$-good vertex
if the degree of v is less than or equal to c and $\varphi_\alpha(v) \le \epsilon$. If v is not a $(c, \alpha, \epsilon)$-good vertex then it is a
$(c, \alpha, \epsilon)$-bad vertex. When there is no ambiguity in the parameters $(c, \alpha, \epsilon)$ we will simply call vertices good
or bad vertices.
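
Definition 5 translates directly into a breadth-first computation. The sketch below assumes $0 < \alpha < 1$ and treats vertices unreachable from v as contributing zero weight. The function names and the star example are hypothetical.

```python
from collections import deque

def alpha_weight(adj, v, alpha):
    """phi_alpha(v): sum over u != v of alpha ** d(u, v), with distances found by BFS."""
    dist = {v: 0}
    queue = deque([v])
    total = 0.0
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                total += alpha ** dist[w]
                queue.append(w)
    return total

def is_good(adj, v, c, alpha, eps):
    """(c, alpha, eps)-good: degree at most c and alpha-weight at most eps."""
    return len(adj[v]) <= c and alpha_weight(adj, v, alpha) <= eps

# Hypothetical example: a star with center 0 and three leaves.
star = [[1, 2, 3], [0], [0], [0]]
```

With $\alpha = 0.1$ the center has weight about 0.3 and each leaf about 0.12, so for c = 2 the leaves can be good while the center is bad because of its degree.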



2.3 Relaxation and Mixing Times 

Although not necessary for our results, to make use of existing theory it is convenient to make the assumption
that the Gibbs sampling is lazy, that is, we introduce a self-loop probability of one half for all states. It is well
known that Gibbs sampling is a reversible Markov chain with stationary distribution P. Let $1 = \lambda_1 >
\lambda_2 \ge \ldots \ge \lambda_m \ge -1$ denote the eigenvalues of the transition matrix of Gibbs sampling. The spectral
gap is defined as $\min\{1 - \lambda_2, 1 - |\lambda_m|\}$ and the relaxation time $\tau$ is the inverse of the spectral gap. The
relaxation time can be given in terms of the Dirichlet form of the Markov chain by the equation

$\tau = \sup\Big\{ \frac{2 \sum_{\sigma} P(\sigma)(f(\sigma))^2}{\sum_{\sigma \neq \tau} P(\sigma, \tau)(f(\sigma) - f(\tau))^2} : \sum_{\sigma} P(\sigma) f(\sigma) = 0 \Big\}$ (8)

where f is any function on configurations, $P(\sigma, \tau) = P(\sigma)P(\sigma \to \tau)$ and $P(\sigma \to \tau)$ is the transition
probability from $\sigma$ to $\tau$. We use the result that for reversible Markov chains the relaxation time satisfies

$\tau \le \tau_{mix} \le \tau\Big(1 + \frac{1}{2} \log(\min_{\sigma} P(\sigma))^{-1}\Big)$ (9)

where $\tau_{mix}$ is the mixing time (see e.g. HI). In all our examples we have $\log(\min_{\sigma} P(\sigma))^{-1} = \mathrm{poly}(n)$ so
by bounding the relaxation time we can bound the mixing time up to a polynomial factor.



For our proofs it will be useful to use the notion of block dynamics. The Gibbs sampler can be generalized
to update blocks of vertices rather than individual vertices. For blocks $V_1, V_2, \ldots, V_k \subset V$, not necessarily
disjoint, with $V = \cup_i V_i$, the block dynamics of the Gibbs sampler updates a configuration $\sigma$ by choosing a
block $V_i$ uniformly at random and assigning the spins in $V_i$ according to the Gibbs distribution conditional
on the spins on $G - V_i$. The relaxation time of the Gibbs sampler can be given in terms of the relaxation
time of the block dynamics and the relaxation times of the Gibbs sampler on the blocks.

Proposition 1 If $\tau_{block}$ is the relaxation time of the block dynamics and $\tau_i$ is the maximum relaxation
time on $V_i$ given any boundary condition from $G - V_i$ then by Proposition 3.4 of [17]

$\tau \le \tau_{block} (\max_i \tau_i) \max_{v \in V} \#\{j : v \in V_j\}.$ (10)



2.3.1 Canonical Paths and Conductance 

We will use the following conductance result which follows from Cheeger's inequality, see e.g., [14].

Proposition 2 Consider an ergodic reversible Markov chain $X_t$ on a discrete space $\Omega$ where for any two
states $a, b \in \Omega$ such that $P(a, b) := P(a)P(a \to b) > 0$ it holds that $P(a, b) \ge \epsilon$. Then

$\tau_{mix} \le 2/\epsilon^2.$
Proposition 3 Suppose that for any two states $\sigma, \eta$ in the state space we have a canonical path $\gamma(\sigma, \eta) =
(\sigma = \sigma^{(0)}, \sigma^{(1)}, \ldots, \sigma^{(k)} = \eta)$ such that each transition satisfies $P(\sigma^{(i)}, \sigma^{(i+1)}) > 0$. Let L be the length
of the longest canonical path between two states and let

$\rho = \sup_{(\eta', \eta'')} \sum_{(\sigma, \eta) : (\eta', \eta'') \in \gamma(\sigma, \eta)} \frac{P(\sigma)P(\eta)}{P(\eta', \eta'')}$

where the supremum is over pairs of states $\eta', \eta''$ with $P(\eta', \eta'') > 0$ while the sum is over all pairs of states.
Then the relaxation time satisfies

$\tau \le L\rho.$



2.3.2 Path Coupling

We use the path coupling technique Q to bound the relaxation time. The proposition below follows from [3]
and H, see also [2]. For two configurations $\sigma, \sigma' \in C^V$ we denote their Hamming distance by $d_H(\sigma, \sigma') =
|\{v : \sigma(v) \neq \sigma'(v)\}|$.

Proposition 4 Consider Gibbs sampling on a graph G. Suppose that for any pair of configurations $\sigma_1, \sigma_2$
that differ in one site only, there is a way to couple the dynamics such that if $\sigma'_1$ and $\sigma'_2$ denote the
configurations after the update then:

Then 



2.4 Block mixing 

For the proof we will consider block dynamics where the blocks are in some sense weakly connected. We
will bound the relaxation time of the block dynamics in terms of single site dynamics of the sites connecting
the blocks as follows.

Lemma 2 Let P be any Gibbs measure taking values in C. Let $U \subset V$ and fix some boundary condition
$\eta$ on $\partial^+ U$. Suppose that U is the disjoint union of subsets $U_i$. Further suppose that for all i there exist
$w_i \in U_i$ such that there are no edges between $U - U_i$ and $U_i - \{w_i\}$. Let $W = \cup_i \{w_i\}$. Let $B_i = \partial^+_U U_i$
and let

$p_{w_i}(x) = P_{U_i \cup B_i}(\sigma(w_i) = x \mid \sigma(B_i) = \eta(B_i)).$ (11)
We define the distribution Q on $C^W$ by

$Q(\sigma(W)) = \frac{1}{Z} \widetilde{P}_W(\sigma(W)) \prod_i p_{w_i}(\sigma(w_i))$ (12)

where $\widetilde{P}$ is the activity free distribution from Definition 1. Then the relaxation time $\tau_Q$ of Gibbs sampling
for Q satisfies $\tau_{block} \le \max(|W|, \tau_Q)$.

Proof: Let $P^\eta$ denote the probability measure on U with boundary conditions $\eta$. Then by the Markov
property and (12) it follows that $P^\eta_W = Q$. We note furthermore that from the Markov property it follows
that the measure $P^\eta$ satisfies for any i:

$P^\eta(\sigma(B_i) = \sigma' \mid \sigma(U \setminus B_i) = \sigma'') = Q(\sigma(w_i) = \sigma'(w_i) \mid \sigma(W \setminus \{w_i\}) = \sigma''(W \setminus \{w_i\}))$

$\times P^\eta(\sigma(B_i \setminus \{w_i\}) = \sigma'(B_i \setminus \{w_i\}) \mid \sigma(w_i) = \sigma'(w_i)).$ (13)

Write $\sigma_t$ for the state of the block dynamics with blocks $B_i$ and boundary conditions $\eta$. Write $\sigma'_t$ for the
state of the single site dynamics for (12). Then assuming that we have $\sigma_0(W) = \sigma'_0$ we obtain by equation
(13) that the dynamics on $\sigma$ and $\sigma'$ may be coupled in such a way that for all t:

• $\sigma_t(W) = \sigma'_t$.

• If all the blocks (sites) in $\sigma_t$ ($\sigma'_t$) have been updated at least once then:

$P(\sigma_t = \sigma^* \mid \sigma_t(W) = \sigma^{**}) = P^\eta(\sigma = \sigma^* \mid \sigma(W) = \sigma^{**}).$

Note that the probability that at least one block has not been updated by time t is at most $|W|(1 - 1/|W|)^t$.
Let $P^t$ denote the distribution of $\sigma_t$ and similarly $Q^t$. Given an optimal coupling between $Q^t$ and Q consider
the coupling of $P^t$ to $P^\eta$ where given two configurations $(\sigma'_1, \sigma'_2)$ distributed according to the coupling, we
let $\sigma_1$ be distributed according to the conditional distribution given $\sigma'_1$ and similarly for $\sigma_2$. Moreover by
the argument above it follows that we may define $\sigma_1$ and $\sigma_2$ in such a way that if $\sigma'_1(W) = \sigma'_2(W)$ and all
blocks have been updated then $\sigma_1 = \sigma_2$. This implies that

$d_{TV}(P^t, P^\eta) \le d_{TV}(Q^t, Q) + |W|(1 - 1/|W|)^t.$

Since the relaxation time measures the exponential rate of convergence to the distribution we conclude that
$\tau_{block} \le \max(|W|, \tau_Q)$. ■

Our bounds on the relaxation times of trees will be given in terms of their path density, defined as follows.
Definition 6 For a tree $T \subset G$ rooted at $\rho$ we let the maximal path density be defined by

$m(T, \rho) = \max_{\Gamma} \sum_{u \in \Gamma} \deg(u)$

where the maximum is taken over all self-avoiding paths $\Gamma \subset T$ starting at $\rho$.
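
Since a self-avoiding path in T that starts at the root is a root-to-descendant path, the maximal path density can be computed by a single traversal. The sketch below assumes the tree is given as a child map and that vertex degrees (taken in G) are supplied separately. The names `children`, `degree` and `max_path_density` are hypothetical.

```python
def max_path_density(children, degree, root):
    """m(T, rho): the largest sum of deg(u) over paths in the tree that start at the root."""
    best = 0
    stack = [(root, degree[root])]
    while stack:
        u, acc = stack.pop()          # acc = sum of degrees along the path from root to u
        best = max(best, acc)
        for w in children.get(u, []):
            stack.append((w, acc + degree[w]))
    return best

# Hypothetical example: root 0 with children 1 and 2, and vertex 1 has a child 3.
children = {0: [1, 2], 1: [3]}
degree = {0: 2, 1: 2, 2: 1, 3: 1}
density = max_path_density(children, degree, 0)
```

Here the path 0, 1, 3 has total degree 2 + 2 + 1 = 5, which exceeds the path 0, 2 with total 3.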
2.4.1 Colouring Model 

Next we prove two lemmas which will be used together with Lemma 2 to prove relaxation bounds below.
Lemma 3 Let W be a star with center v and k leaves. Let

$Q(\sigma(W)) = \frac{1}{Z} \widetilde{P}_W(\sigma(W)) \prod_{w \in W} p_w(\sigma(w))$

where the $p_w$ are functions such that for all $w \in W$, $\sum_{x \in C} p_w(x) = 1$ and for all $w \in W$, $x \in C$ either
$p_w(x) \ge (q\delta)^{-1}$ or $p_w(x) = 0$. Further assume that for some $c < q - 3$ we have that for all $w \in W - v$,
$\#\{x \in C : p_w(x) = 0\} \le c$. Then the relaxation time $\tau$ of the Glauber dynamics on Q is at most $C^k$ where
C is a constant depending only on c, $\delta$, q.

Proof: We first show that the chain is ergodic by constructing a path between any two configurations $\sigma$
and $\eta$ with $Q(\sigma)$ and $Q(\eta) > 0$. Since for each leaf w there are at least 3 colours x with $p_w(x) > 0$ we
can find a colour x(w) such that $p_w(x(w)) > 0$ and $\sigma(v) \neq x(w) \neq \eta(v)$. The path is constructed by
changing the states of the leaves to x(u), then changing the state of v to $\eta(v)$, then finally changing the
states of the leaves to $\eta(u)$. Now by the hypothesis there are at most $q^{k+1}$ colourings of W so $Z \le q^{k+1}$ so
we have that $Q(\sigma), Q(\eta) \ge (q^2\delta)^{-(k+1)}$. For two adjacent states $\sigma$ and $\sigma'$ with $Q(\sigma), Q(\sigma') > 0$, we have
$Q(\sigma \to \sigma') \ge (q\delta(k+1))^{-1}$ and so $Q(\sigma, \sigma') \ge (q^2\delta)^{-(k+1)}(q\delta(k+1))^{-1}$. From Proposition 2 it now
follows that

$\tau \le \big((q\delta(k+1))(q^2\delta)^{k+1}\big)^2 \le C^k$

as needed. ■

Similarly, it is easy to see that

Lemma 4 Let W be a graph with k vertices of maximum degree d. Let

$Q(\sigma(W)) = \frac{1}{Z} \widetilde{P}_W(\sigma(W)) \prod_{w \in W} p_w(\sigma(w))$

where the $p_w$ are functions such that for all $w \in W$, $\sum_{x \in C} p_w(x) = 1$ and for all $w \in W$, $x \in C$ either
$p_w(x) \ge (q\delta)^{-1}$ or $p_w(x) = 0$. Further, for some $c < q - d - 2$ we have that for all $w \in W$, $\#\{x \in
C : p_w(x) = 0\} \le c$. Then the relaxation time of the Glauber dynamics on Q is at most $C^k$ where C is a
constant depending only on c, $\delta$, d and q.

We can now obtain polynomial mixing time results for the type of blocks that will be used in the construction. 



Theorem 7 Let $T \subset U \subset V$ such that T is a tree rooted at $\rho$ and so that there are no edges between
$T - \{\rho\}$ and $U - T$. Suppose that for all $u \in T$, $\#\{v \in V - U : (v, u) \in E\} \le c$ and that for each $u \in T$,

$\sup_{\sigma(\partial^+ T_u)} \; \sup_{x \in C} \; \sup_{y \in C : P_{T_u^+}(\sigma(u) = y \mid \sigma(\partial^+ T_u)) \neq 0} \frac{P_{T_u^+}(\sigma(u) = x \mid \sigma(\partial^+ T_u))}{P_{T_u^+}(\sigma(u) = y \mid \sigma(\partial^+ T_u))} \le \delta.$ (14)

For some $l \ge 1$ assume there are at most l edges between $\{\rho\}$ and $U - T$. Let $\tau$ be the relaxation time
of the Glauber dynamics on T. If $q > c + l + 2$ then for any boundary condition $\eta$ on $\partial^+ T$ we have that
$\tau \le C^{m(T, \rho)}$ where $m(T, \rho)$ is the maximal path density on T and where C is a constant depending only on
c, $\delta$, q and l.

Proof: We proceed by induction on $m(T, \rho)$. If T is a single point then $\tau = 1$ and so $\tau \le C^{m(T, \rho)}$. Now
suppose $\rho$ has children $u_1, \ldots, u_k \in T$. By induction the relaxation time of the Glauber dynamics on $T_{u_i}$
satisfies $\tau_i \le C^{m(T_{u_i}, u_i)}$ and by the definition of the maximal path density $m(T_{u_i}, u_i) \le m(T, \rho) - k$. Let $\tau_{block}$
denote the relaxation time of the block dynamics on T with blocks $\{\{\rho\}, T_{u_1}, \ldots, T_{u_k}\}$. Applying Lemmas 2 and 3 we get that the
block dynamics satisfies $\tau_{block} \le C^k$. Then by Proposition 3.4 of [17] we have that

$\tau \le \tau_{block} \max_i \tau_i \le C^{k} C^{m(T, \rho) - k} = C^{m(T, \rho)}$

which completes the result. ■



2.4.2 Hardcore Model 

Lemma 5 Let W be a graph and let 

where the $p_w$ are functions such that for some $\delta$ and all $w \in W$, $\delta \le p_w(0) \le 1$ and $p_w(0) + p_w(1) = 1$.

Then the relaxation time $\tau$ of the Glauber dynamics of $Q$ satisfies $\tau \le C^{|W|}$ where $C$ depends only on $\beta$
and $\delta$.
Proof: We use the method of canonical paths from Proposition 3. Let $\sigma$ and $\eta$ be two configurations with
$Q(\sigma) > 0$ and $Q(\eta) > 0$. We define the canonical path to be a path which begins from $\sigma$, then sequentially
changes the states of all the vertices to 0, and then sequentially changes the state of each $w \in W$ to 1 if $\eta(w) = 1$.
Clearly each path is of length at most $2|W|$. Now suppose $(\eta', \eta'')$ is a step in some path. The two configurations must differ
at exactly one site $w \in W$ and suppose that $\eta'(w) = 1$ and $\eta''(w) = 0$. If $(\eta', \eta'')$ is in the canonical path
7(0-^) then a > rj' under the canonical partial ordering. Now P[ri' t]"] = ^^0'^ > -pj^. Then 



< ^((l + exp(max(/5,0))ri)l'^l. 



Similarly the same bound holds for pairs with ri'{w) = and r]"{w) = 1 so p 
From Proposition |3] it now follows that 

9| 

T2 < ^^((l + exp(max(/3,0))r 1)1^1 < IQI'^I exp(max(/3, 0)|Vr|)<5-l'^l , 


as needed. ■ 

Theorem 8 Let $T \subset V$ be a tree rooted at $\rho$. Then $\tau \le C^{m(T,\rho)}$ where $m(T, \rho)$ is the maximal path density
on $T$ and where $C$ is a constant depending only on $\beta$.

Proof: We proceed by induction on $m(T, \rho)$. If $T$ is a single point then $\tau = 1$ and so $\tau \le C^{m(T,\rho)}$. Now
suppose $\rho$ has children $u_1, \dots, u_k \in T$. By induction the relaxation time $\tau_i$ of the Glauber dynamics on $T_{u_i}$
satisfies $\tau_i \le C^{m(T_{u_i}, u_i)}$. By definition of the maximal path density $m(T_{u_i}, u_i) \le m(T, \rho) - k$. Let $\tau_{block}$
denote the relaxation time of the block dynamics on $T$ with blocks $\{\{\rho\}, T_{u_1}, \dots, T_{u_k}\}$. We define the distribution $Q$ by



z 

and $p_{u_i}$ is as in equation (11). Applying Lemma 2 with $W = \{\rho, u_1, \dots, u_k\}$ implies that $\tau_{block} \le \max(k + 1, \tau_Q)$
where $\tau_Q$ is the relaxation time of the Glauber dynamics on the measure $Q$. In the hardcore model for
any vertex $v$ and any boundary condition $\sigma(V - \{v\})$ on $V - \{v\}$ we have that $P(\sigma(v) = 0 \mid \sigma(V - \{v\})) \ge \frac{1}{1 + e^{\beta}}$,
the probability that the spin at $v$ is 0 given that the spins of all its neighbors are 0, and so each
$p_w(0) \ge \frac{1}{1 + e^{\beta}}$. It follows that in Lemma 5 we can take $\delta = \frac{1}{1 + e^{\beta}}$ so $\tau_{block} \le \max(k + 1, C_1^{k+1}) \le C^k$
for sufficiently large $C$, where $C_1$ is the constant from Lemma 5. Then by Proposition 3.4 of [17] we have that

$$\tau \le \tau_{block} \max_i \{1, \tau_i\} \le C^{k} C^{m(T,\rho) - k} = C^{m(T,\rho)}$$

which completes the result. ■
2.4.3 Soft constraint Models 

For soft constraint models, bounding the mixing time is simplified by the fact that removing an edge adds at 
most a constant multiplicative factor to the relaxation time. 
Theorem 9 Let $\tau$ be the relaxation time of the Glauber dynamics on a tree $T \subset V$. Given arbitrary
boundary conditions,

$$\tau \le \exp(4\|H\| m(T))$$
where $\|H\|$ is the norm of the Hamiltonian.

Proof: 

We proceed by induction on $m$ with a similar argument to the one used in [19] for the Ising model. Note
that if $m = 0$ the claim holds true since $\tau = 1$. For the general case, let $v$ be the root of $T$, and denote
its children by $u_1, \dots, u_k$ and denote the subtree of the descendants of $u_i$ by $T^i$. Now let $T'$ be the tree
obtained by removing the $k$ edges from $v$ to the $u_i$, let $P'$ be the model on $T'$ and let $\tau'$ be the relaxation
time on $T'$. By equation (8) we have that

mm^,^P(o-,r)/P'(fj, r) 

Now we divide $T'$ into $k + 1$ blocks $\{\{v\}, T^1, \dots, T^k\}$. Since these blocks are not connected to each other
the mixing time of the block dynamics is simply 1. By applying Proposition 3.4 of [17] we get that the
relaxation time on $T'$ is simply the maximum of the relaxation times on the blocks,

$$\tau' \le \max_i \{1, \tau^i\},$$

where $\tau^i$ is the relaxation time on $T^i$. Note that by the definition of $m$, it follows that the value of $m$ for each
of the subtrees $T^i$ satisfies $m(T^i) \le m - k$, and therefore for all $i$ it holds that $\tau^i \le \exp(4\|H\|(m - k))$.
Combined with the edge-removal bound above, this implies that $\tau \le \exp(4\|H\| m)$ as needed. ■

2.5 Correlation Decay in Tree Blocks 

In this subsection we prove that if we look at a tree block, all of whose leaves are good, then for large enough 
$q$ we have the correlation decay property (14).

Definition 7 For $0 < \lambda < 1$ and $U \subset V$ define the block boundary weighting as the function $\psi_\lambda$ defined by:

$$\psi_\lambda(v) = \sum_{w \in \partial^+ U} \lambda^{d(w,v)}$$

for all $v \in U$.

Lemma 6 If every vertex in d~^U is (c, a, e)-good then for all A < a^, 

, / N eA 
^{v) < — 

Proof: Let v ^ U and let u ^ d'^Ubean exterior boundary vertex which minimizes the distance to v. Then 
and the result follows since for A < we have i^xiv) < -^ipa^ (v). ■ 
2.5.1 Colouring

Lemma 7 Suppose that $T = (V_T, E_T)$ is an induced subgraph of $G = (V, E)$ that is a tree and suppose
that for all $v \in V_T$, $\psi(v) \le 1$. Then there exists a $q$ depending only on $\lambda$ such that for all $v \in V_T$:

$$\sup_{\sigma(\partial^+ T)} \ \sup_{x \in C} \ \sup_{y \in C : P(\sigma(v) = y \mid \sigma(\partial^+ T)) \ne 0} \frac{P(\sigma(v) = x \mid \sigma(\partial^+ T))}{P(\sigma(v) = y \mid \sigma(\partial^+ T))} \le \exp(\psi(v)) \qquad (17)$$

where the supremum is over all boundary conditions $\sigma(\partial^+ T)$ on $\partial^+ T$.
Proof: 

Fix $v$ as the root of the tree. We will prove the result by induction on the size of the tree. When the tree
consists of a single vertex $v$ the quantity in the left hand side of (17) is clearly 1.

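Let $u_1, \dots, u_l$ be the children of $v$ in $T$. Consider the graph $G' = (V', E')$ obtained from $G$ by removing
the vertex $v$ and all adjacent edges. Let

$$\delta_i = \sup_{\sigma(\partial^+ T_{u_i})} \ \sup_{x \in C} \ \sup_{y \in C : P_{T_{u_i}^+}(\sigma(u_i) = y \mid \sigma(\partial^+ T_{u_i})) \ne 0} \frac{P_{T_{u_i}^+}(\sigma(u_i) = x \mid \sigma(\partial^+ T_{u_i}))}{P_{T_{u_i}^+}(\sigma(u_i) = y \mid \sigma(\partial^+ T_{u_i}))}. \qquad (18)$$

For $w' \in T_{u_i}$ write $\psi_i(w') = \sum_{w \in \partial^+ T_{u_i}} \lambda^{d(w, w')}$. Note that $\psi_i$ is the function $\psi$ for the subtree $T_{u_i}$ in the
graph $G'$. Note moreover that for all $w$ we have $\psi_i(w) \le \psi(w)$. By the induction hypothesis we therefore
have $\delta_i \le \exp(\psi_i(u_i))$. Let $d_i = \#\{w \in V \setminus T_{u_i} : (w, u_i) \in E\}$ and note that there are at least $q - d_i$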
elements y G C with Prp+ {(t{v) = y|(T(5"^T„.)) > so 

min{P^+ {a{v) = y\a{d+T^^)) : {a{v) = y\a{d+T^J) > 0} < 
y "-i q — di 

and so by dTSl ) we have 

maxPj,+ {a{v) = y\a{d+Tu^)) < — ^. (19) 

y 'i' Q di 
Since $d_i \lambda \le \psi_i(u_i) \le 1$, taking $q > 2/\lambda$ yields $q - d_i > q/2$. When $0 \le x \le 1$ we have $e^x - 1 \le 2x$ so
Si — 1 < 2'ip{x). And since is increasing in x 

1 - {a{v) = x\a{d+Tu,)) {a{v) = y\a{d+T^^)) - {a{v) = x\a{d+T^,)) 

sup j — = 1 + sup ■ 



1 - {a{v) = yk(9+T„J) ' " 1 - ia{v) = y\a{d+Tu,)) 

1 H ^— — (By ( [T9l ) and since is increasing) 

1 - -TT^ 



q-di 

q- di- 5i 

< 1 + -^^=0} (since 5i < e and q-di> q/2) 

q/2 - e 

< 1 + ~ ^^'^'=°>^ (taking q > 4e) 

Q 

< 1 + ^^-(^-^+^^- (since 5. - 1 < 2i.{x)) 

Q 

Sipiim) + Adi 

< exp( ) 



where the supremum is taken over all x,y ^ C and boundary conditions on d^Tu- Now note ) > 
A ipi{ui) (it may be strictly greater due to the contribution of the neighbors of v outside T). Therefore: 

P{a{v) = x\a{d+T)) yj 1 - Pt^^ {<t{v) = x\a{d+TuJ) 
sup sup sup — - — 1 — -— - — -- = sup — j — 

a{d+T) x&C yeC:P{aiv)=y\aid+T))j^O P{(^{v) = y\a{d+T)) ^ ^ " ^T^, (^(^) = ^^(^^^"j) 

8lpi{Ui)+Adi 

< exp( ) 

Q 

which completes the induction provided that q is large enough so that q > max(4e, j + p-). ■ 
The following corollary follows immediately from Lemma 7 and Lemma 6.

Corollary 1 For all $c, a > 0$ and $\epsilon > 0$ there exists a $q$ for which the following holds. Let $T \subset V$ be a tree
such that every vertex in $\partial^+ T$ is $(c, a, \epsilon)$-good. Then for any $0 < \lambda < 1$ there exists a $q$ such that

a{d+T) x&C y(iC:P{a(v)=y\a(d+T))^0 ^y<^\V) " j j wtd+T 

where the supremum is over all boundary conditions a{d'^U) on d'^U. 
2.5.2 Hardcore model 

Lemma 8 Suppose that $T = (V_T, E_T)$ is an induced subgraph of $G = (V, E)$ that is a tree. For $v \in V_T$
and $\eta$ a boundary condition on $\partial^+ T$ let $P^{\eta}$ denote the measure $P(\sigma(v) = \cdot \mid \sigma(\partial^+ T) = \eta)$. Then if $\beta_\lambda = \log \lambda$
then for all $\beta < \beta_\lambda$ and $v \in V_T$:

$$d_{TV}(P^{\eta^1}, P^{\eta^2}) \le \psi(v) \qquad (20)$$

for any two boundary conditions $\eta^1$ and $\eta^2$ on $\partial^+ T$ where $d_{TV}$ is the total variation distance.

Proof: Since the left hand side of equation (20) is bounded by 1 we can assume that $\psi(v) \le 1$. Fix $v$ as the
root of the tree. We will prove the result by induction on the size of the tree. Let $u_1, \dots, u_l$ be the children
of $v$ in $T$ and let $w_1, \dots, w_m$ be the children of $v$ in $\partial^+ T$. Consider the graph $G' = (V', E')$ obtained from
$G$ by removing the vertex $v$ and all adjacent edges and let $P_i^{\eta}$ denote $P'(\sigma(u_i) = \cdot \mid \eta)$. Then

dTv{P''\P''') = \Picjiv) = 0\ri') - P{a{v) = 0\ri^)\ 

1 1 



< 



1 + nLi < (0) nr=i i + nLi < (o) nr=i hvi„.,} 

I m I m 

n< (0)11 -n<(o)ni{<.o: 



i=l 



nU<.(o)-nU<(o) 



1=1 

m > 1 
m = 



(21) 



Now if $m \ge 1$ then $\psi(v) \ge \lambda$ so $d_{TV}(P^{\eta^1}, P^{\eta^2}) \le \psi(v)$. This establishes equation (20) for trees of size 1.
We now proceed by induction.

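Observe the simple inequality that if $0 \le x_1, \dots, x_l \le 1$ and $0 \le y_1, \dots, y_l \le 1$ then

$$\left| \prod_{i=1}^{l} x_i - \prod_{i=1}^{l} y_i \right| = \left| \sum_{j=1}^{l} (x_j - y_j) \prod_{i=1}^{j-1} x_i \prod_{i=j+1}^{l} y_i \right| \le \sum_{j=1}^{l} |x_j - y_j|. \qquad (22)$$

Applying equation (22) to equation (21) we get that when $m = 0$,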
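The telescoping inequality (22) is easy to sanity-check numerically. The sketch below is only an illustration: the function names `product_gap_and_bound` and `telescoping_sum` are hypothetical helpers, not part of the paper, and the inputs are assumed to lie in $[0, 1]$.

```python
import math

def product_gap_and_bound(xs, ys):
    """Return (|prod xs - prod ys|, sum_j |x_j - y_j|) for equal-length sequences in [0, 1]."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    gap = abs(math.prod(xs) - math.prod(ys))
    bound = sum(abs(x - y) for x, y in zip(xs, ys))
    return gap, bound

def telescoping_sum(xs, ys):
    """Middle expression of (22): sum_j (x_j - y_j) * prod_{i<j} x_i * prod_{i>j} y_i."""
    return sum(
        (xs[j] - ys[j]) * math.prod(xs[:j]) * math.prod(ys[j + 1:])
        for j in range(len(xs))
    )
```

The first function compares the two outer expressions of (22); the second evaluates the telescoping sum, which should agree with the signed difference of the two products up to rounding.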



dMP"' , ^^') < E l< (0) - < (0)1- 



i=l 



By the inductive hypothesis applied to the tree T^^ we have that 



so 



'Jed+Tu 



dMP''" , P^') < E l< (0) - K I ^ ^(^) 
which completes the induction. ■ 



1=1 



2.5.3 Soft constraint models 

Lemma 9 Suppose that $T = (V_T, E_T)$ is an induced subgraph of $G = (V, E)$ that is a tree. For $v \in V_T$
and $\eta$ a boundary condition on $\partial^+ T$ let $P^{\eta}$ denote the measure $P(\sigma(v) = \cdot \mid \sigma(\partial^+ T) = \eta)$. Then there exists
an $H^* > 0$ depending only on $\lambda$ such that if $\|H\| < H^*$ and $v \in V_T$:

$$d_{TV}(P^{\eta^1}, P^{\eta^2}) \le \psi(v) \qquad (23)$$

for any two boundary conditions $\eta^1$ and $\eta^2$ on $\partial^+ T$ where $d_{TV}$ is the total variation distance.

Proof: Since the left hand side of equation (23) is bounded by 1 we can assume that $\psi(v) \le 1$. Let
$K = 4(e^{\|H\|} - e^{-\|H\|})$. We can take $\|H\|$ to be small enough so that $4K < \lambda$ and for $0 \le x \le 1/\lambda$ we have
$\exp(-xK) \le 1 - xK/2$ and $\exp(2Kx) \le 1 + 4Kx$. Fix $v$ as the root of the tree. We will prove the result
by induction on the size of the tree. Let $u_1, \dots, u_l$ be the children of $v$ in $T$ and let $u_{l+1}, \dots, u_m$ be the
children of $v$ in $\partial^+ T$. Consider the graph $G' = (V', E')$ obtained from $G$ by removing the vertex $v$ and all
adjacent edges, let $P'$ denote the induced soft constraint model on $G'$ and let $P_i^{\eta}$ denote $P'(\sigma(u_i) = \cdot \mid \eta)$.
Then for all $i$ and $z \in C$,



>eM-KdTv{Pt.PTj) 



r 



Similarly we have 



Ey^ece'^'''^'>Piiy^) 



<exp{KdTv{Pl.,Pl.)) 



Then for each x € C, 



(-)(^) e'^(^) nr=i E,..c e^^^^y^)pl^ im) E..c e'^^'^ UT=i Ey^.c e^(^'^^)< iy^) 

e'^(^) nr=i Ey.ec e^(^'^»)< iy^) E.ec e^(^) UT=i Ey.eC e^(^'^')< iy^) 
<exp (2Kf2dTv{Pi^,P^l)]. 



i=l 



Then 



P^i {x) 



x&C ^ ' 

/ m 

<e^v[2KY,dTv{Pt^Pt)] -1 



i=l 



Now suppose that $T$ is a single vertex $\{v\}$ so $u_1, \dots, u_m$ are all in $\partial^+ T$ and so $\psi(v) = m\lambda$. If $m = 0$ then
$d_{TV}(P^{\eta^1}, P^{\eta^2}) = \psi(v) = 0$. If $1 \le m \le 1/\lambda$ then

$$d_{TV}(P^{\eta^1}, P^{\eta^2}) \le \exp(2Km) - 1 \le 4Km \le \lambda m = \psi(v)$$

while if $m > 1/\lambda$ then $\psi(v) > 1$. So this verifies the case when $T$ is a single point. For the induction step
our inductive hypothesis says that



which completes the induction. ■ 
2.6 Block Construction 

Lemma 10 For two $(c, a, \epsilon)$-bad points $u, u'$ we define $u \sim u'$ if there is a path $u = u_1, u_2, \dots, u_k = u'$
such that no two consecutive vertices $u_i, u_{i+1}$ on the path are both $(c, a, \epsilon)$-good. Then $\sim$ is an equivalence
relation on the $(c, a, \epsilon)$-bad vertices in $G$.

Proof: The relation is clearly reflexive and symmetric. Suppose that $u \sim u'$ and $u \sim u''$.
Then there exist paths $u = v_1, v_2, \dots, v_k = u'$ and $u = w_1, w_2, \dots, w_l = u''$ such that no two consecutive
vertices are $(c, a, \epsilon)$-good. Let $i = \max\{j : v_j \in \{w_1, w_2, \dots, w_l\}\}$ and suppose that $v_i = w_j$. Then
the path $u' = v_k, v_{k-1}, \dots, v_i, w_{j+1}, w_{j+2}, \dots, w_l = u''$ is a path with no two consecutive $(c, a, \epsilon)$-good
vertices so $u' \sim u''$. Hence $\sim$ is transitive and is an equivalence relation. ■

We now describe our method for partitioning G into smaller blocks for some fixed (c, a, e). 

• Two $(c, a, \epsilon)$-bad points $u, u'$ are in the same block if and only if $u \sim u'$.

• A $(c, a, \epsilon)$-good vertex is in the same block as any bad point it is adjacent to.

• A $(c, a, \epsilon)$-good vertex not adjacent to any bad point forms a separate block.

By Lemma 10 the first point defines a partition of the $(c, a, \epsilon)$-bad vertices. If a good vertex $v$ is adjacent to
bad vertices $u_1$ and $u_2$ then $u_1, v, u_2$ has no two consecutive good points so $u_1 \sim u_2$ and hence good points
are assigned to exactly one block. Hence this defines a partition of $G$ into blocks whose boundaries are all
$(c, a, \epsilon)$-good. We will abuse notation and let $\sim$ denote the equivalence relation on all of $G$ for this partition.
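The three partition rules can be sketched with a union-find structure. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is given as a hypothetical adjacency dictionary `adj`, and the set `good` of good vertices is taken as an input rather than computed from the definition of goodness. It uses the observation that a path between bad vertices with no two consecutive good vertices decomposes into bad-bad edges and bad-good-bad steps.

```python
def partition_into_blocks(adj, good):
    """Map each vertex to a block label.

    adj  : dict mapping each vertex to the set of its neighbours (assumed symmetric)
    good : set of vertices treated as good; all other vertices are bad
    """
    parent = {v: v for v in adj}

    def find(v):
        # Find the representative of v, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for u in adj:
        if u in good:
            # A good vertex links all of its bad neighbours (bad-good-bad paths)
            # and joins their block; with no bad neighbour it stays a singleton.
            bad_nbrs = [w for w in adj[u] if w not in good]
            for w in bad_nbrs:
                union(bad_nbrs[0], w)
            if bad_nbrs:
                union(bad_nbrs[0], u)
        else:
            # Adjacent bad vertices are equivalent.
            for w in adj[u]:
                if w not in good:
                    union(u, w)
    return {v: find(v) for v in adj}
```

On a path 0-1-2-3-4-5 with bad vertices 0, 2, 5, this groups {0, 1, 2, 3} and {4, 5}, so the only edge between blocks joins the good vertices 3 and 4.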

Lemma 11 Suppose that $G$ satisfies equation (4). Then for any $0 < L < \infty$ there exists $(c, a, \epsilon)$ such that
every self-avoiding path $u_1, u_2, \dots, u_{L \log n}$ contains two consecutive $(c, a, \epsilon)$-good vertices $u_i, u_{i+1}$.

Proof: We can assume that L < a and set e = ^. Then since "^^=1^^ y^a{ui) < 5 log n at most ^ log n of 
the Ui have (pa{ui) > £■ If c = ^ then if (pa{ui) < e then 




If i}{v)<l then YT=i dTviPr ^P? and so 





a 




a 



u:{u,Ui)£E 



SO Ui is (c, a, e)-good. Since the path ui, M2, tiL log n contains at least |Llog n (c, a, e)-good vertices it 
must contain two consecutive good vertices. ■ 



The following corollary is immediate from the definition of the equivalence relation. 
Corollary 2 Suppose that $G$ satisfies equation (4). Then for any $0 < L < \infty$ there exists $(c, a, \epsilon)$ such that
if $u \sim v$ then $d(u, v) \le L \log n$.

Our next step is to define a partition of the graph into blocks whose boundaries are good vertices and such
that each block is either a tree or a tree plus some bounded number of edges. The decomposition into blocks 
relies on the following combinatorial lemma. 

Lemma 12 Consider a graph $G = (V, E)$ where $V$ is the disjoint union of $V_G$ and $V_B$. Assume further that
for all $v \in V$ it holds that $t(v, a \log n) \le t$ and that every self-avoiding path $u_1, \dots, u_{L \log n}$ contains two
consecutive elements in $V_G$, where $(20t + 2)L < a$. Then we can partition $G$ into blocks $\{V_j\}$ such that there is at
most one edge between any two blocks. Moreover, for all $j$, the diameter of $V_j$ is less than $(20t + 2)L \log n$,
it holds that $\partial V_j \subset V_G$, and $V_j$ satisfies one of the following:

• It is a tree.

• There exist vertices $w_i$ and disjoint subsets $U_i \subset V_j$ such that each $U_i$ is a tree of depth at most
$2L \log n$, $V_j = \cup_i U_i$ and $w_i \in U_i$, and there are no edges between $U_i - w_i$ and $V_j - U_i$. Furthermore the
distance between $\partial V_j$ and $W_j = \cup_i w_i$ is at least $L \log n$ and the subgraph $W_j$ has $|W_j| \le 20tL \log n$
and largest degree at most $2t$.

Corollary 3 Suppose that $G$ satisfies equation (4). Then there exist $0 < L < \infty$ and $(c, a, \epsilon)$ such that we
can partition $G$ into blocks $\{V_j\}$ such that there is at most one edge between any two blocks. Moreover, for all $j$,
the diameter of $V_j$ is less than $(20t + 2)L \log n$, it holds that $\partial V_j \subset V_G$, and $V_j$ satisfies one of the following:

• It is a tree.

• There exist vertices $w_i$ and disjoint subsets $U_i \subset V_j$ such that each $U_i$ is a tree of depth at most
$2L \log n$, $V_j = \cup_i U_i$ and $w_i \in U_i$, and there are no edges between $U_i - w_i$ and $V_j - U_i$. Furthermore the
distance between $\partial V_j$ and $W_j = \cup_i w_i$ is at least $L \log n$ and the subgraph $W_j$ has $|W_j| \le 20tL \log n$
and largest degree at most $2t$.

Proof: Letting $V_G$ be the set of good vertices and $V_B$ the set of bad vertices, the proof of the corollary
follows from Lemma 12 by taking $L$ such that $(20t + 2)L < a$ and choosing $(c, a, \epsilon)$ according to Corollary 2. ■

We now prove Lemma 12.

Proof: The first step of the proof will be the construction of $W = \cup_j W_j \subset V$. Beginning with $W$ as the
empty set we can add to $W$ in three ways:

• If $u_1, u_2, \dots, u_m$ is a self-avoiding path of vertices in $V - W$ such that $u_1$ and $u_m$ are adjacent and
$3 \le m \le 5L \log n$ then add $\{u_1, u_2, \dots, u_m\}$ to $W$.

• If $u_1, u_2, \dots, u_m$ is a self-avoiding path in $V - W$ such that both $u_1$ and $u_m$ are adjacent to $W$ and
$2 \le m \le 5L \log n$ then add $\{u_1, u_2, \dots, u_m\}$ to $W$.

• If $u_1$ is adjacent to two vertices in $W$ then add $\{u_1\}$ to $W$.

The construction of $W$ ends when no more additions are possible.

Claim 2.1 $W$ does not depend on the order of the additions.
Proof: Note that if $W'$ and $W''$ are two different $W$'s obtained for different orders of additions then one may
add all elements in $W' \setminus W''$ to $W''$ and vice versa. ■

Claim 2.2 At each stage of the construction no connected component $W_j$ of $W$ is a tree; each connected
component $W_j$ of $W$ has

$$|W_j| \le (10L\,t(W_j) - 5L) \log n,$$

where $t(W_j)$ is the tree excess of $W_j$.

Proof: We split the additions into three cases. If $u_1, u_2, \dots, u_m$ is not adjacent to any component of $W$
then this creates a new component $W_{new}$ of $W$. This must be achieved by an addition of the first type. The
new component must contain a loop and have tree excess at least 1, and $|W_{new}|$ is at most $5L \log n$ which
is at most $(10L\,t(W_{new}) - 5L) \log n$.

Next suppose that an addition $u_1, u_2, \dots, u_m$ is adjacent to exactly one existing component $W_{old}$ of $W$.
Then the addition forms a new component $W_{new}$ which contains a new loop so $t(W_{new}) \ge t(W_{old}) + 1$. On
the other hand

$$|W_{new}| \le (10L\,t(W_{old}) - 5L + 5L) \log n \le (10L\,t(W_{new}) - 5L) \log n.$$

Finally the addition $u_1, u_2, \dots, u_m$ may be adjacent to two or more components $W_1, \dots, W_k$ of $W$ and so
forms one new component $W_{new}$ from these. Then $t(W_{new}) \ge \sum_{j=1}^{k} t(W_j)$ and

$$|W_{new}| \le 5L \log n + \sum_{j=1}^{k} |W_j| \le (10L\,t(W_{new}) - 5L) \log n.$$



Claim 2.3 When the construction of $W$ is completed, each component $W_j$ of $W$ is of size at most $20tL \log n$
and tree excess at most $t$. The distance between two components of $W$ is at least $5L \log n$. All the degrees
in $W$ are bounded between 1 and $2t$.

Proof: We have seen that at each of the additions the tree excess of a component increases by at least one.
Suppose one of the components of $W$ satisfies $|W_j| > 20tL \log n$. If at some point in the construction
the maximum diameter of a component is $D$ then after an addition the new maximum diameter is at most
$2D + 5L \log n$. So at some point in the construction there must have been a component $W_j$ with

$$\left(10t - \tfrac{5}{2}\right) L \log n \le |W_j| \le 20tL \log n.$$

Let $v \in W_j$. Then $W_j \subset B(v, 20tL \log n)$ so $t(W_j) \le t(v, 20tL \log n) \le t$. Then

$$|W_j| \le (10L\,t(W_j) - 5L) \log n \le (10t - 5) L \log n,$$

which is a contradiction. Hence every component of $W$ has size at most $20tL \log n$ and tree excess at most
$t$. By construction all components are separated by distance at least $5L \log n$. Since the tree excess is at
most $t$ and by construction $W$ has no leaves the largest degree is at most $2t$. ■

As in Lemma 10 for $u, u' \in V_B$ we write $u \sim u'$ if there is a path connecting $u$ to $u'$ with no two consecutive
vertices belonging to $V_G$. For each component $W_j$ of $W$ we define $V_j$ as

$$V_j := \{u \in V : \exists u' \in V, \ u \sim u', \ d(u', W_j) \le L \log n\}$$
By construction $W_j \subset V_j$ and if $d(u, W_j) \le L \log n$ then $u \in V_j$ while if $d(u, W_j) > 2L \log n$ then by
Corollary 2 $u \notin V_j$. It follows that the components $V_j$ are disjoint and are not adjacent. We will show that
the components satisfy the hypothesis of the lemma.

Suppose that there exist two self-avoiding paths $u_0, u_1, \dots, u_l$ and $v_0, v_1, \dots, v_m$ with $u_l = v_m$, $u_0, v_0 \in W_j$
and $u_1, \dots, u_l, v_1, \dots, v_m \in V_j - W_j$ which are not identical (i.e. for some $i$, $u_i \ne v_i$). If $l + m \le 5L \log n$
then $u_0, u_1, \dots, u_l, v_0, v_1, \dots, v_m$ must contain a loop of length less than $5L \log n$ which could be
added to $W$, contradicting our assumption. So without loss of generality $l > \frac{5}{2} L \log n$. Then there exists $u'$
with $u' \sim u_{\frac{5}{2} L \log n}$ and $d(u', W_j) \le L \log n$. Then there exists a path in the equivalence class of $u'$ from
$u_{\frac{5}{2} L \log n}$ to $u'$ with length at most $L \log n$. Since $d(u', w) \le L \log n$ for some $w \in W$ there also exists a path
from $u'$ to $w$ in $\{u : d(u, W) \le L \log n\} \subset V_j$ with length at most $L \log n$. Combining these paths there is a path
from $u_{\frac{5}{2} L \log n}$ to $w$ in $V_j$ of length at most $2L \log n$. Combining this path with $u_0, u_1, \dots, u_{\frac{5}{2} L \log n}$ we must
have a loop of length at most $\frac{9}{2} L \log n$. But this could be an addition to $W$ which is a contradiction. Hence
for each $u \in V_j - W_j$ there is a unique self-avoiding path from $u$ to $W_j$ in $V_j - W_j$. It follows that we can
partition $V_j$ into $\{U_i\}$ as required.

Those points in $V_B$ that are not in some $V_j$ can be placed in blocks according to their equivalence class
from the relation $\sim$. All such extra blocks are trees of maximum diameter $L \log n$. Finally, a vertex $v \in V_G$
belongs to the block defined by $u \in V_B$ if $(u, v)$ is an edge in $E$, and if no such edge exists $v$ is a separate block.



2.7 Block Relaxation Times 
2.7.1 Colouring Model 

Lemma 13 Suppose that $G$ satisfies equation (4). For sufficiently large $q$ the relaxation times of the Glauber
dynamics on each of the blocks constructed in Lemma 12 are bounded by $n^C$.

Proof: In the blocks Vj which are trees any path is of length at most 20tL log n so 

miVj,v) < -mJVi,20tLlogn) < (l + ^^)-logn. 
a a a 

By Theorem 7 and Lemma 7 the relaxation time is bounded by $n^C$.

Now consider a block Vj of the second type. We divide Vj into its sub-blocks Ui. Each Ui is a tree and every 
V G dy^ Ui is (c, a, e)-good. Any path in Ui has length at most 2L log n so 

1 '2L S 

m(Ui,Wi) < —ma{Ui,2Llogn) < (IH )— logn. 

a a a 

Then by Theorem 7 and Lemma 7 the relaxation time of the Glauber dynamics on each $U_i$ is bounded by $n^C$.
In Lemma 7 take $q$ to be large enough so that $\log \lambda < -4/L$. Then for $w_i \in W_j$,

sup sup — — < exp( V A'^("'"^)) (24) 

^{d+u.) -.2^ec Pu^yjQ^ {<Wi) = y\a{dy^ Ui)) 



<exp( ^ A^'°s") (25) 

3 

< exp(n-3) (26) 
so $P(\sigma(w_i) = x \mid \sigma(\partial^+_{V} U_i)) \ge q^{-1} \exp(-n^{-3})$. Then by Lemmas 2 and 4 the relaxation time of the block
dynamics with blocks $\{U_i\}$ is bounded by $n^C$. Then by Proposition 3.4 of [17] we have that the relaxation
time of the Glauber dynamics on $V_j$ is bounded by $n^C$ (for a possibly larger constant $C$). ■

2.7.2 Hardcore Model 

Lemma 14 Suppose that $G$ satisfies equation (4). For sufficiently small $\beta$ the relaxation times of the
Glauber dynamics on each of the blocks constructed in Lemma 12 are bounded by $n^C$.

Proof: In the blocks Vj which are trees, any path is of length at most 20tL log n so 

m{Vj,v) < -ma{V,20tLlogn) < (l + ^^)-logn. 
a a a 



By Theorem 8 the relaxation time is bounded by $n^C$.
Now consider a block $V_j$ of the second type. By Lemmas 2 and 5 the relaxation time of the block dynamics
with blocks $\{U_i\}$ is bounded by $n^C$. Then by Proposition 3.4 of [17] we have that the relaxation time of
the Glauber dynamics on $V_j$ is bounded by $n^C$ (for a possibly larger constant $C$). ■

2.7.3 Soft Constraints 

Lemma 15 Suppose that $G$ satisfies equation (4). For small $\|H\|$ the relaxation times of the Glauber
dynamics on each of the blocks constructed in Lemma 12 are bounded by $n^C$.

Proof: In the blocks Vj which are trees any path is of length at most 20tL log n so 

m{Vj,v) < -mJVj,20tLlogn) < (l + ^^)-logn. 
a a a 



By Theorem 9 the relaxation time is bounded by $n^C$.
Now consider a block $V_j$ of the second type. Let $V_j'$ be the block obtained by removing each of the edges in
the skeleton $W_j$ and let $\tau'$ be the relaxation time on $V_j'$. In the proof of Theorem 9 we showed that removing
an edge affects the relaxation time by a factor of at most $\exp(4\|H\|)$, and $W_j$ has $O(\log n)$ edges, so $\tau \le n^{C\|H\|} \tau'$.
In $V_j'$ each of the trees $U_i$ is separated so $\tau'$ is simply the maximum of the relaxation times of the $U_i$. By Theorem 9 the
relaxation time of each of the $U_i$ is bounded by $n^C$ so $\tau \le n^C$ (for a possibly larger constant $C$). ■

2.8 Mixing time of block dynamics 

We use the partition from Lemma 12 as blocks for the block dynamics of the Glauber dynamics. We use
the method of path coupling to bound the mixing time of the block dynamics. Let $d_H$ denote the Hamming
distance between two configurations. Suppose that $T \subset V_j$ is a tree, let $v \in \partial^+ V_j$ be $(c, a, \epsilon)$-good and let $\eta, \eta'$ be
two boundary conditions on $\partial^+ V_j$ which differ only at $v$ and suppose that $\rho$ is the only vertex in $T$ adjacent
to $v$. We must couple two states $\sigma(T), \sigma'(T)$ so that they are distributed as $Q$ and $Q'$ respectively where
$Q(\sigma(T)) = P(\sigma(T) \mid \eta)$ and $Q'(\sigma'(T)) = P(\sigma'(T) \mid \eta')$. This can be done as follows. Root $T$ at $\rho$ and
let $\bar{u}$ denote the parent of $u \in T$. First couple $\sigma(\rho)$ and $\sigma'(\rho)$ according to their marginal distributions
$P(\sigma(\rho) \mid \eta)$ and $P(\sigma'(\rho) \mid \eta')$ using a coupling that attains their total variation distance. Proceed inductively down the
tree by coupling $\sigma(u)$ and $\sigma'(u)$ according to $P(\sigma(u) \mid \eta, \sigma(\bar{u}))$ and $P(\sigma'(u) \mid \eta', \sigma'(\bar{u}))$, again attaining
the total variation distance. When $\sigma(\bar{u}) = \sigma'(\bar{u})$ then $\sigma(u) = \sigma'(u)$. We will show that we can bound the
expected Hamming distance of these coupled distributions.
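Each step of the coupling above pairs two single-vertex distributions so that they disagree with probability equal to their total variation distance. The following is a minimal sketch of that standard maximal coupling for finite distributions; the function names and the dictionary representation are illustrative assumptions, not notation from the paper.

```python
def total_variation(p, q):
    """Total variation distance between two finite distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def maximal_coupling(p, q):
    """Return a joint distribution {(x, y): prob} with marginals p and q
    such that P(X != Y) equals the total variation distance of p and q."""
    keys = sorted(set(p) | set(q))
    # Mass that both distributions share is placed on the diagonal.
    overlap = {k: min(p.get(k, 0.0), q.get(k, 0.0)) for k in keys}
    joint = {(k, k): m for k, m in overlap.items() if m > 0}
    tv = 1.0 - sum(overlap.values())
    if tv > 1e-15:
        # The leftover mass of p and of q is paired independently off the diagonal.
        rest_p = {k: p.get(k, 0.0) - overlap[k] for k in keys}
        rest_q = {k: q.get(k, 0.0) - overlap[k] for k in keys}
        for x in keys:
            for y in keys:
                w = rest_p[x] * rest_q[y] / tv
                if w > 0:
                    joint[(x, y)] = joint.get((x, y), 0.0) + w
    return joint
```

In the tree coupling, `p` and `q` would play the roles of the conditional laws of $\sigma(u)$ and $\sigma'(u)$ given the already coupled parent values.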
2.8.1 Colouring Model

Lemma 16 Let $T$ be a tree such that $\psi(u) = \sum_{w \in \partial^+ T} \lambda^{d(w,u)} \le \epsilon$ for all $u \in T$. If $\delta > 0$ then for some
sufficiently large $q = q(\delta, \epsilon, \lambda)$, the above coupling has

$$E\, d_H(\sigma(T), \sigma'(T)) \le \delta.$$

Proof: Let $\gamma > 0$ be such that $\varphi_\gamma(v) \le \delta$. For all $u \in T$ we have that $\#\{w \in V - T : (w, u) \in E\} \le \epsilon/\lambda$.
By Lemma 7 we choose $q$ large enough so that for each $u \in T$ and $x \in C$, $P(\sigma(u) = x \mid \eta) \le \gamma/2$. Then

$$d_{TV}\big(P(\sigma(u) = \cdot \mid \eta, \sigma(\bar{u})),\ P(\sigma(u) = \cdot \mid \eta, \sigma'(\bar{u}))\big) \le 2 \max_x P(\sigma(u) = x \mid \eta) \le \gamma.$$

So given that $\sigma(\bar{u})$ and $\sigma'(\bar{u})$ disagree, $\sigma(u)$ and $\sigma'(u)$ disagree with probability at most $\gamma$. It
follows that the probability that $\sigma(u)$ and $\sigma'(u)$ disagree is at most $\gamma^{d(u,v)}$ and so
$E\, d_H(\sigma(T), \sigma'(T)) \le \sum_{u \in T} \gamma^{d(u,v)} \le \varphi_\gamma(v) \le \delta$ as required. ■
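The final bound in this proof is a sum of $\gamma^{d(u,v)}$ over the vertices of the tree. The sketch below evaluates that sum for a small rooted tree, assuming the root is the unique tree vertex adjacent to the boundary vertex $v$ (so it sits at distance 1 from $v$). The `children` dictionary and the function name are hypothetical.

```python
from collections import deque

def disagreement_bound(children, root, gamma):
    """Sum of gamma ** d(u, v) over all tree vertices u, where the root is at
    distance 1 from the boundary vertex v and each child is one step further."""
    total = 0.0
    queue = deque([(root, 1)])
    while queue:
        u, dist = queue.popleft()
        total += gamma ** dist
        for child in children.get(u, []):
            queue.append((child, dist + 1))
    return total
```

For a tree whose vertices have about $b$ children each, the sum behaves like a geometric series in $b\gamma$, which illustrates why $\gamma$ has to be small compared with the local branching for the bound to stay below $\delta$.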

Lemma 17 Let $V_j$ be a block constructed from Lemma 12. If $v \in \partial^+ V_j$ and $\eta, \eta'$ are boundary conditions
on $\partial^+ V_j$ which differ only at $v$ then for sufficiently large $q = q(a, a, t, \delta)$ we can couple colourings
$\sigma(V_j), \sigma'(V_j)$ distributed as $P(\sigma(V_j) \mid \eta)$, $P(\sigma'(V_j) \mid \eta')$ respectively so that

$$E\, d_H(\sigma(V_j), \sigma'(V_j)) \le \delta.$$

Proof: The case when $V_j$ is a tree follows by Lemma 16 so we consider the blocks of the second type. Let
$v$ be adjacent to $U_i$. If $\sigma^1(W_j)$ and $\sigma^2(W_j)$ are two colourings of $W_j$ then by equation (24)

and so the total variation distance between P{a{Wj)\r]) and the free measure on colourings on Wj is 
0(n~^). It follows that we can couple <j{Wj) and a'{Wj) so that they agree with probability 1 — 0(n~^). 
On the event they disagree there are at most \Vj\ < n disagreements so this event contributes 0{n^^) 
disagreements to the expected value. So now on the event that $\sigma(W_j) = \sigma'(W_j)$, for all $k \ne i$ we can set
$\sigma(U_k - \{w_k\}) = \sigma'(U_k - \{w_k\})$ since they have the same boundary conditions. This just leaves $\sigma(U_i - \{w_i\})$
and $\sigma'(U_i - \{w_i\})$ to be coupled. Now $U_i - \{w_i\}$ is a tree which has every boundary vertex $(c, a, \epsilon)$-good
except perhaps Wi. Then repeating the argument of Corollary [T] we have that when A = 

i^{u) < A + Yl ^"'^"''"^ < A + e. 
u'ed+u,-{w,} 

Applying Lemma 16 to $U_i - \{w_i\}$ completes the result. ■

Lemma 18 For large enough $q$ the relaxation time of the block dynamics with blocks $\{V_j\}$ from Lemma 12
is $O(n)$.

Proof: Choose $q$ large enough so that in Lemma 17 we can take $\delta < 1/c$. By the method of path coupling
described in Section 2.3.2 it is sufficient to show that if $\sigma_0, \sigma_0'$ are two colourings with $d_H(\sigma_0, \sigma_0') = 1$
differing only at $v$ then we can couple one step of the block dynamics so that the new pair $\sigma_1, \sigma_1'$ has

$$E\, d_H(\sigma_1, \sigma_1') \le 1 - \beta/n$$

for some $\beta > 0$. Let $K$ be the number of blocks. We couple them as follows. If the block $V_j$ chosen by
the block dynamics contains $v$ then we set $\sigma(V_j) = \sigma'(V_j)$ and have $d_H(\sigma_1, \sigma_1') = 0$. If the block chosen
is adjacent to $v$ then we couple $V_j$ according to Lemma 17. The expected number of new disagreements
is at most $\delta$. If $V_j$ neither contains nor is adjacent to $v$ then we set $\sigma(V_j) = \sigma'(V_j)$ and the number of
disagreements does not change. Now if $v$ is adjacent to some block $V_j$ it must be in the boundary and
therefore must be $(c, a, \epsilon)$-good. Since it has degree at most $c$ it is adjacent to at most $c$ blocks so

$$E\, d_H(\sigma_1, \sigma_1') \le 1 - \frac{1}{K} + \frac{c\delta}{K} = 1 - \frac{\beta}{K} \le 1 - \frac{\beta}{n}$$

where $\beta = 1 - c\delta$, which completes the proof. ■
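The one-step contraction can be checked with exact arithmetic. The sketch below assumes the reading of the proof in which the block containing $v$ is chosen with probability $1/K$ and removes the disagreement, while each of at most $c$ adjacent blocks is chosen with probability $1/K$ and adds at most $\delta$ expected disagreements. The function names are illustrative only.

```python
from fractions import Fraction

def one_step_distance_bound(num_blocks, c, delta):
    """Upper bound 1 - 1/K + c*delta/K on the expected Hamming distance after one
    block update, starting from a single disagreement (delta as a Fraction)."""
    K = Fraction(num_blocks)
    return 1 - 1 / K + c * Fraction(delta) / K

def contraction_constant(c, delta):
    """beta = 1 - c*delta; the coupling contracts only when this is positive."""
    return 1 - c * Fraction(delta)
```

With $K = 100$, $c = 3$ and $\delta = 1/10$ the bound is $993/1000 = 1 - \beta/K$ with $\beta = 7/10$, while any $\delta \ge 1/c$ gives $\beta \le 0$ and no contraction.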

2.8.2 Hardcore Model 

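Lemma 19 Let $T$ be a tree such that $\psi(u) = \sum_{w \in \partial^+ T} \lambda^{d(w,u)} \le \epsilon$ for all $u \in T$. If $\delta > 0$ then there exists
$\beta^* = \beta^*(\delta, \lambda, \epsilon)$ such that if $\beta < \beta^*$, the above coupling has

$$E\, d_H(\sigma(T), \sigma'(T)) \le \delta.$$

Proof: Let $\gamma > 0$ be such that $\varphi_\gamma(v) \le \delta$. We can choose $\beta$ small enough so that $\frac{e^\beta}{1 + e^\beta} \le \gamma$. For all $u \in T$,
under any conditioning, the probability that $\sigma(u) = 1$ is at most

$$P(\sigma(u) = 1 \mid \sigma(V - \{u\}) = 0) = \frac{e^\beta}{1 + e^\beta} \le \gamma.$$

Then

$$d_{TV}\big(P(\sigma(u) = \cdot \mid \eta, \sigma(\bar{u})),\ P(\sigma(u) = \cdot \mid \eta, \sigma'(\bar{u}))\big) \le \big| P(\sigma(u) = 1 \mid \eta, \sigma(\bar{u})) - P(\sigma(u) = 1 \mid \eta, \sigma'(\bar{u})) \big| \le \gamma.$$

So given that $\sigma(\bar{u})$ and $\sigma'(\bar{u})$ disagree, $\sigma(u)$ and $\sigma'(u)$ disagree with probability at most $\gamma$. It
follows that the probability that $\sigma(u)$ and $\sigma'(u)$ disagree is at most $\gamma^{d(u,v)}$ and so
$E\, d_H(\sigma(T), \sigma'(T)) \le \sum_{u \in T} \gamma^{d(u,v)} \le \varphi_\gamma(v) \le \delta$ as required. ■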

The following results follow similarly to the colouring model. 

Lemma 20 Let $V_j$ be a block constructed from Lemma 12. For $\delta > 0$ there exists $\beta^* = \beta^*(a, a, t, \delta)$ such
that for $\beta < \beta^*$ if $v \in \partial^+ V_j$ and $\eta, \eta'$ are boundary conditions on $\partial^+ V_j$ which differ only at $v$ then we can
couple states $\sigma(V_j), \sigma'(V_j)$ distributed as $P(\sigma(V_j) \mid \eta)$, $P(\sigma'(V_j) \mid \eta')$ respectively so that

$$E\, d_H(\sigma(V_j), \sigma'(V_j)) \le \delta.$$

Lemma 21 There exists $\beta^* = \beta^*(a, a, t, \delta)$ such that for $\beta < \beta^*$ the relaxation time of the block dynamics
with blocks $\{V_j\}$ from Lemma 12 is $O(n)$.

2.8.3 Soft Constraints Model 

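Lemma 22 Let $T$ be a tree such that $\psi(u) = \sum_{w \in \partial^+ T} \lambda^{d(w,u)} \le \epsilon$ for all $u \in T$. If $\delta > 0$ then there exists
$H^* = H^*(\delta, \lambda, \epsilon) > 0$ such that if $\|H\| < H^*$, the above coupling has

$$E\, d_H(\sigma(T), \sigma'(T)) \le \delta.$$

Proof: Let $\gamma > 0$ be such that $\varphi_\gamma(v) \le \delta$. Repeating the argument of Lemma 9 we can choose $\|H\|$ small
enough so that

$$d_{TV}\big(P(\sigma(u) = \cdot \mid \eta, \sigma(\bar{u})),\ P(\sigma(u) = \cdot \mid \eta, \sigma'(\bar{u}))\big) \le \gamma.$$
The remainder of the proof follows similarly from Lemma 19. ■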

The following results follow similarly from the colouring model. 
Lemma 23 Let $V_j$ be a block constructed from Lemma 12. For $\delta > 0$ there exists $H^* = H^*(a, a, t, \delta)$ such
that for $\|H\| < H^*$ if $v \in \partial^+ V_j$ and $\eta, \eta'$ are boundary conditions on $\partial^+ V_j$ which differ only at $v$ then we
can couple states $\sigma(V_j), \sigma'(V_j)$ distributed as $P(\sigma(V_j) \mid \eta)$, $P(\sigma'(V_j) \mid \eta')$ respectively so that

$$E\, d_H(\sigma(V_j), \sigma'(V_j)) \le \delta.$$

Lemma 24 There exists $H^* = H^*(a, a, t, \delta)$ such that for $\|H\| < H^*$ the relaxation time of the block
dynamics with blocks $\{V_j\}$ from Lemma 12 is $O(n)$.

2.9 Main Results 

The main results now follow easily using the block dynamics approach of Proposition 3.4 of [17].
Proof (main theorem for colourings): For large enough $q$, by Lemma 18 the relaxation time of the block dynamics of the
Glauber dynamics on $G$ with blocks $\{V_j\}$ from Lemma 12 is $O(n)$. By Lemma 13 the relaxation time of
the Glauber dynamics on each block is bounded by $n^C$. Then by Proposition 3.4 of [17] we have that the
relaxation time is $O(n^{C+1})$. There are at most $q^n$ colourings of $G$ so $\log(1/\min_\sigma P(\sigma)) \le n \log q$ so the
mixing time of the Glauber dynamics is bounded by $O(n^{C+2})$ which completes the result. ■

The proofs of Theorems 4 and 6 follow similarly.